Complex networks have attracted increasing interest from various fields of science. It
has been demonstrated that each complex network model presents specific topological
structures which characterize its connectivity and dynamics. Complex network
classification relies on the use of representative measurements that model topological
structures. Although there are a large number of measurements, most of them are
correlated. To overcome this limitation, this paper presents a new measurement for
complex network classification based on partially self-avoiding walks. We validate
the measurement on a data set composed of 40,000 complex networks of four well-known
models. Our results indicate that the proposed measurement improves the correct
classification of networks compared to the traditional ones.

Keywords: complex networks, deterministic walks, networks classification, networks

I. INTRODUCTION



Due to the development of informatics, the acquisition of huge data sets has become
possible. The analysis of these data sets has migrated from a few individual components
(nodes) to a huge number of components, all of them interconnected (links). It has been
noticed that in many systems physically close nodes are highly connected, whereas distant
nodes are weakly connected. As a consequence, a new type of topological structure emerged.
This new structure interpolates between the regular lattice and the random (Erdős and
Rényi) one. This change of paradigm has joined the mathematical graph formalism (associated
with finite systems) with the statistical physics analysis in a new type of topological
structure, which has been named complex network. It has been realized that complex networks
can describe several types of topological structures of our daily lives, ranging from
informatics systems (computer connections in the WWW or internet page referencing) to
biology (protein structures, metabolic networks), passing through social systems
(scientific citation, actor networks, disease and rumor propagation, linguistics, etc.)
and pattern recognition.

To use the complex network formalism in a given system, one must first specify which
parts of the system form the nodes and how these nodes are interconnected. It has been
shown that completely different real systems may share a common topological structure.
Thus, the question is how to differentiate them through topological measurements.

For a robust network classification, representative measures must be extracted. The
problem is how to define the set of measures that is the most appropriate for a specific
application. Several measures have been proposed, such as the average number of connections
of a node, hierarchical degree, clustering coefficient, assortativity, etc. Nevertheless,
many of these measures are correlated, leading to redundancy. Although optimal results
are not guaranteed, the use of statistical methods (such as principal component analysis or
linear discriminant analysis) to select and improve the measure set is an alternative to
reduce redundancy and subjective selection.

Here we propose to consider networks in which each link has a weight (weighted networks)
and to use a new way to classify them. We propose to use agents that leave from each
node and go to the closest neighbor (the smallest link weight) that has not been visited in
the preceding $\mu$ time steps. This partially self-avoiding deterministic rule can be
modified so that the agent goes to the furthest neighbor (the largest link weight, instead
of the smallest one) at each time step. Although the partially self-avoiding walk rules are
simple, the agent trajectories are complicated and lead to an effective exploration of the
network. The closest (furthest) neighbor rule produces trajectories that explore the network
locally (globally). In this study, partially self-avoiding walks have been applied to
classify four kinds of networks: Erdős-Rényi, geographical, small-world and scale-free. This
simple procedure allows our method to achieve better results in network classification than
the traditional measurements. We note that a similar technique has been successfully
employed in the classification of image texture.

This paper is organized as follows. In Sec. II, a brief review of the results of partially
self-avoiding walks is presented. In Sec. III, we describe the partially self-avoiding walk
method for network classification. Sec. IV analyzes the method with regard to network
features such as average degree and number of vertices. Numerical experiments and results
are presented in Sec. V. Finally, in Sec. VI, concluding remarks and possible extensions of
the method are presented.

II. PARTIALLY SELF-AVOIDING WALK

Random walks in regular or random environments have been extensively studied. Nevertheless,
deterministic walks in regular or random environments also present interesting results.
These results can be applied to a wide variety of practical situations, such as image
analysis, pattern recognition, fractals, thesaurus dictionaries, optimization, etc.

For instance, consider a partially self-avoiding deterministic walk, where an agent wishes
to visit $N$ points randomly distributed in a map of dimension $d$. These points can be
considered as sites, and the agent moves from one to another following the rule of going,
at each discrete time step, to the nearest site not visited in the previous $\mu$ steps. The
agent performs a partially self-avoiding walk, where the self-avoidance is limited to the
memory window $\tau = \mu - 1$. Although the dynamical deterministic rule is simple, the
agent trajectory can be very complicated.

The agent movements depend strictly on the data set configuration and on the starting
site^{28,29}. They are entirely performed based on a neighborhood table, so that the
distances among the sites are simply a way of ranking their neighbors. This feature leads to
invariance under scale transformations. Starting from a given point, the trajectory begins
with the agent visiting preferentially new points and ends in a cycle, where the same
sequence of $p \geq \mu + 1$ points is visited repeatedly (see Fig. 1). The beginning of the
trajectory is a transient of time $t$, and the ending cycle is an attractor of period $p$
(a $p$-attractor). Notice that $t$ and/or $p$ differ for different starting points, so that
a transient time and attractor period joint distribution $S_{\mu,d}^{(N)}(t,p)$ can be
computed for the generated trajectories. One obtains the same trajectories for the same
point configuration regardless of the scale one is dealing with.
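
The scale invariance described above can be checked numerically. The following is a minimal sketch, not the authors' code: the helper `neighbor_table`, the 50 random points and the scale factor 4.0 are illustrative assumptions. It builds the neighborhood table (neighbors ranked by distance) for a point set and for a rescaled copy, and compares the two rankings.

```python
import random


def neighbor_table(points):
    """For each point, list the indices of the other points, nearest first."""
    table = []
    for i, p in enumerate(points):
        def squared_distance(j):
            return sum((a - b) ** 2 for a, b in zip(p, points[j]))
        others = [j for j in range(len(points)) if j != i]
        # Ties (if any) are broken by index so that the ranking is deterministic.
        others.sort(key=lambda j: (squared_distance(j), j))
        table.append(others)
    return table


random.seed(0)
points = [(random.random(), random.random()) for _ in range(50)]
scaled = [(4.0 * x, 4.0 * y) for x, y in points]   # same map, different scale

# The walk only uses the ranking, so identical tables imply identical trajectories.
same_ranking = neighbor_table(points) == neighbor_table(scaled)
```

Because a deterministic walk reads only this table, any transformation that preserves the ranking of neighbors leaves every trajectory unchanged.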

The most trivial case of the deterministic agent is $\mu = 0$. The agent remains in the
same site, and the trajectory has null transient time and an attractor with period $p = 1$.
The transient time and cycle period joint distribution is simply given by
$S_{0,d}^{(N)}(t,p) = \delta_{t,0}\,\delta_{p,1}$, where $\delta_{i,j}$ is the Kronecker
delta. Despite its triviality, this case is interesting because it is the simplest situation
of a stochastic version of this partially self-avoiding walk.

For a memoryless agent ($\mu = 1$), at each time step, the agent must leave the current site
and go to the nearest one. After a very short transient time, the agent becomes trapped
by a couple of mutually nearest neighbors. The transient time and period joint probability
distribution, for $N \gg 1$, can be analytically calculated:
$S_{1,d}^{(\infty)}(t,p) = \Gamma(1 + I_d^{-1})\,(t + I_d^{-1})\,\delta_{p,2} / \Gamma(t + p + I_d^{-1})$,
where $\Gamma(z)$ is the gamma function and $I_d = I_{1/4}[1/2, (d+1)/2]$ is the normalized
incomplete beta function. In the limit $d \to \infty$, one is also able to calculate it
analytically.

When greater values of $\mu$ are considered, the cycle distribution is no longer peaked at
$p_{min} = \mu + 1$, but presents a whole spectrum of cycles with period $p \geq p_{min}$.
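
As a sanity check on the memoryless case, the sketch below evaluates the distribution in the form $S_{1,d}^{(\infty)}(t,p) = \Gamma(1 + I_d^{-1})(t + I_d^{-1})\delta_{p,2}/\Gamma(t + p + I_d^{-1})$ and sums it over the transient time. The function name `s1` and the truncation at 200 terms are assumptions made for illustration. For $d = 1$ the incomplete beta function has the closed form $I_{1/4}(1/2, 1) = (1/4)^{1/2} = 1/2$, so no special-function library is needed.

```python
import math


def s1(t, p, inv_id):
    """Joint distribution for mu = 1; inv_id stands for 1 / I_d."""
    if p != 2:
        return 0.0                      # Kronecker delta: only period 2 occurs
    # Work with log-gamma to avoid overflow for large t.
    log_ratio = math.lgamma(1.0 + inv_id) - math.lgamma(t + p + inv_id)
    return (t + inv_id) * math.exp(log_ratio)


inv_i1 = 2.0                            # d = 1: I_1 = 1/2, hence 1 / I_1 = 2
total = sum(s1(t, 2, inv_i1) for t in range(200))
```

The sum telescopes, since $(t + a)/\Gamma(t + a + 2) = 1/\Gamma(t + a + 1) - 1/\Gamma(t + a + 2)$ with $a = I_d^{-1}$, so the distribution is normalized for any $d$; the numerical total only confirms this for one value.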

Another possible way to deal with these partially self-avoiding deterministic walks is to
consider a two-dimensional lattice and randomly assign weights to the
$(\mu + 1)^{\mathrm{th}}$ nearest neighbors, so that a given agent, with memory $\mu$,
explores the lattice. This system is depicted in Fig. 1.



III. COMPLEX NETWORK MODELING 



Below we describe how these partially self-avoiding walks can be used to classify networks.
We start by presenting how to perform these walks on networks. Next, we use the transient
time and cycle period joint distribution to create a vector that characterizes the topology
of the network.

FIG. 1. Regular lattice with random weights assigned to the first and second closest
neighbors. An agent leaves a given site and moves according to the rule of not returning to
the last $\mu$ visited sites. The trajectory is composed of a transient and an attractor,
shown in orange and green, respectively.

A. Partially Self-avoiding Walks on Networks 

Consider a network represented by a set of $n$ vertices $V = \{v_1, \ldots, v_n\}$ and a
set of links $W = \{w_{v_i,v_j} \mid v_i, v_j \in V,\ w_{v_i,v_j} \in \mathbb{R}\}$, where
each component $w_{v_i,v_j}$ represents the weight of the link that connects vertex $v_i$ to
vertex $v_j$. In addition, $\eta(v) = \{k \in V \mid \exists\, w_{v,k}\}$ is the set of all
neighbors of the vertex $v$, and $T = \{t_0, t_1, \ldots, t_i \mid t_j \in V\}$ is a vertex
list that stores the agent trajectory.

The agent starts the trajectory at a vertex $v_0$, so that $t_0 = v_0$. Following the rule
of going to the nearest vertex that has not been visited in the previous $\mu$ steps, the
agent builds its trajectory. On networks, the movements taken by the agent are determined
entirely by the set of link weights $W$ and the memory (Eq. 2). The memory $M_\mu$ is a
subset composed of the last $\mu$ visited vertices of the trajectory $T$:

$$M_\mu = \bigcup_{k = i - \mu}^{i} t_k \tag{2}$$

Since weight equalities may occur, in which case the agent does not know where to go, one
creates the set

$$C = \{\, v \in V \mid \arg\min w_{v,t_i},\ v \notin M_\mu \,\} \tag{3}$$

to represent the vertices that are the closest (with respect to the weights) to the given
vertex $t_i$ and do not belong to the memory set $M_\mu$. Depending on the chosen movement
rule, at each time step the agent leaves its vertex and goes to the nearest one (defined by
$\arg\min$) or to the furthest one (replacing $\arg\min$ by $\arg\max$ in Eq. 3). The
trajectory is iterated by Eqs. 3 and 4.

To resolve possible equalities in $C$: if $C$ has only one element, this element is chosen
as the next vertex of the agent. Otherwise, the equality is resolved by a function $\phi$
that returns a single vertex. This function may return a randomly chosen vertex or execute a
more sophisticated operation:

$$t_{i+1} = \begin{cases} C, & |C| = 1 \\ \phi(C), & \text{otherwise} \end{cases} \tag{4}$$
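
The movement rule can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the network is stored as a dictionary of adjacency dictionaries, the memory is passed as the list of the last visited vertices including the current one, and the names `candidates`, `next_vertex` and `tie_break` are hypothetical. The default tie-breaking function simply takes the smallest vertex label, whereas the text also allows a random choice.

```python
def candidates(weights, current, memory, rule="min"):
    """Candidate set C: allowed neighbours of `current` with extremal weight."""
    allowed = {v: w for v, w in weights[current].items() if v not in memory}
    if not allowed:
        return []                       # dead end: every neighbour is in memory
    pick = min if rule == "min" else max
    best = pick(allowed.values())
    return sorted(v for v, w in allowed.items() if w == best)


def next_vertex(weights, current, memory, rule="min", tie_break=min):
    """Next vertex: the single candidate, or tie_break(C) when there are ties."""
    c = candidates(weights, current, memory, rule)
    if not c:
        return None
    return c[0] if len(c) == 1 else tie_break(c)


# Hypothetical 4-vertex weighted network (symmetric weights).
w = {
    0: {1: 1.0, 2: 1.0},
    1: {0: 1.0, 2: 2.0},
    2: {0: 1.0, 1: 2.0, 3: 5.0},
    3: {2: 5.0},
}
```

In this toy network, vertex 0 has two neighbors at the same weight, so the candidate set has two elements and the tie-breaking function decides; from vertex 3 with vertex 2 in memory there is no allowed move.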

After a transient time $t$, the agent is trapped in an attractor with period $p$. The
trajectory can either be iterated up to a predetermined number of steps and then searched
for an attractor, or, at each time step, one can check whether an attractor has been reached
and, if so, finish the trajectory. The attractor detection is defined by

$$C = \{\, \exists\, E_{p,i_t} = 0,\ 0 < p,\, i_t < i \,\} \tag{5}$$

considering $\varphi(v)$ as the index of vertex $v$, i.e. $\varphi(v_1) = 1$,
$\varphi(v_2) = 2$. If $E_{p,i_t} = 0$, an attractor with period $p$ and transient time
$t = i_t - 1$ has been detected. If $C = \emptyset$, the agent has
not found any attractor and p = with $t = i$. An efficient computational strategy to find
attractors has been described elsewhere.
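
One simple way to detect the attractor while the walk is running is to record the memory state at every step and stop as soon as a state repeats. The sketch below uses this state-repetition strategy, which is a substitute chosen for illustration and not the detection rule of Eq. 5. The function name `walk`, the `max_steps` cap, the deterministic tie-breaking by vertex label and the convention that the memory holds the last $\mu$ visited vertices including the current one are all assumptions.

```python
def walk(weights, start, mu, rule="min", max_steps=10_000):
    """Return (transient, period) of a walk, or None if no attractor is found."""
    if mu == 0:
        return (0, 1)                   # trivial case: the agent never moves
    pick = min if rule == "min" else max
    traj = [start]
    seen = {}                           # memory state -> step of first occurrence
    while len(traj) <= max_steps:
        i = len(traj) - 1
        memory = tuple(traj[-mu:])
        if memory in seen:              # same state again: the walk is periodic
            period = i - seen[memory]
            t = seen[memory]
            # Move the cycle start back while earlier vertices already repeat.
            while t > 0 and traj[t - 1] == traj[t - 1 + period]:
                t -= 1
            return (t, period)
        seen[memory] = i
        allowed = [(w, v) for v, w in weights[traj[-1]].items() if v not in memory]
        if not allowed:
            return None                 # dead end: all neighbours are in memory
        # Ties in weight are resolved by vertex label (smallest for min, largest for max).
        traj.append(pick(allowed)[1])
    return None


# Hypothetical examples: a weighted path 0-1-2-3 and a weighted triangle.
path = {0: {1: 3.0}, 1: {0: 3.0, 2: 2.0}, 2: {1: 2.0, 3: 1.0}, 3: {2: 1.0}}
triangle = {0: {1: 1.0, 2: 3.0}, 1: {0: 1.0, 2: 2.0}, 2: {0: 3.0, 1: 2.0}}
```

On the path with $\mu = 1$, a walk started at vertex 0 visits 0, 1, 2, 3, 2, 3, ..., i.e. it is trapped by the mutually nearest pair (2, 3) after a transient of two steps. On the triangle with $\mu = 2$ the walk cycles through all three vertices, giving the minimum period $\mu + 1 = 3$.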



B. Signature Vector 

The transient time and cycle period joint distribution $S_{\mu,d}^{(N)}(t,p)$ stores a great
quantity of information concerning the partially self-avoiding deterministic walk in a given
environment. To use these walks effectively, relevant information must be extracted from
$S_{\mu,d}^{(N)}(t,p)$. This relevant information is stored in a signature vector
$\psi_{\mu,din}$, where $din$ represents the dynamics adopted, for instance, the agent going
to the nearest or furthest vertex.

An important issue concerning the partially self-avoiding deterministic walk is the
movement rule $din$ adopted by the agent. On one hand, agents guided by the shortest
distance are appropriate to find attractors in regions with high homogeneity. On the other
hand, agents guided by the longest distance find attractors located in regions with low
homogeneity. The use of different movement rules results in different joint distributions,
allowing the use of information from various sources in the environment characterization.
Trajectories produced by different rules have distinct patterns for the same graph, as can
be seen in Fig. 2. In this work, only two movement rules were used: $din = max$, where the
agent moves to the furthest vertex, and $din = min$, where the agent moves to the nearest
vertex.

FIG. 2. Transient time $t$ and attractor period $p$ joint distributions obtained from the
application of agents with different movement rules on geographical networks with
$N = 1000$ and $\langle k \rangle = 20$. In a geographical network, the vertices are
randomly and uniformly distributed in a square box and the weight of each edge is
proportional to the distance between the vertices. In (a), the walker chooses to go to the
closest site, while in (b), the walker goes to the furthest one.

The signature vector $\psi_{\mu,din}$ is supposed to characterize the environment where the
walks have been performed. To compose the signature vector, an interesting strategy is to
build a histogram $h_{\mu,din}(t + p)$ from the joint distribution. This histogram
represents the number of walks that have a length of $(t + p)$, where $t$ and $p$ are the
transient time and the period of the attractor, respectively. From the histogram, $n$
descriptors are used to compose the signature vector $\psi_{\mu,din}$:

$$\psi_{\mu,din} = [\,h_{\mu,din}(\mu+1),\ h_{\mu,din}(\mu+2),\ \ldots,\ h_{\mu,din}(t+p),\ \ldots,\ h_{\mu,din}(\mu+n)\,] \tag{6}$$

$$h_{\mu,din}(\mu+1) = S_{\mu,d}^{(N)}(0, \mu+1)$$

$$h_{\mu,din}(\mu+2) = S_{\mu,d}^{(N)}(0, \mu+2) + S_{\mu,d}^{(N)}(1, \mu+1)$$

$$h_{\mu,din}(\mu+3) = S_{\mu,d}^{(N)}(0, \mu+3) + S_{\mu,d}^{(N)}(1, \mu+2) + S_{\mu,d}^{(N)}(2, \mu+1)$$

The first descriptor is at position $\mu + 1$, because there is no smaller period.
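
The construction of the histogram and of the signature vector can be sketched as follows. This is a minimal illustration: the function name `signature`, the representation of the walks as a list of `(t, p)` pairs (one per starting vertex) and the normalization by the number of walks are assumptions, not details confirmed by the text.

```python
from collections import Counter


def signature(walks, mu, n):
    """Signature [h(mu+1), ..., h(mu+n)] from a list of (t, p) pairs."""
    total = len(walks)
    joint = Counter(walks)              # occurrences of each (t, p) pair
    hist = Counter()
    for (t, p), count in joint.items():
        hist[t + p] += count / total    # h(l) sums the joint distribution over t + p = l
    return [hist.get(mu + k, 0.0) for k in range(1, n + 1)]


# Hypothetical outcome of four walks with memory mu = 1.
walks = [(0, 2), (0, 2), (1, 2), (2, 2)]
psi = signature(walks, mu=1, n=3)       # [h(2), h(3), h(4)]
```

Here two walks have length 2, one has length 3 and one has length 4, so the three descriptors are 0.5, 0.25 and 0.25.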

The joint distribution depends on the value of $\mu$ and on the movement rule $din$. To
capture information from different sources and scales, a signature vector $\psi$ consisting
of the concatenation of the vectors $\psi_{\mu,din}$ for multiple values of $\mu$ and
different movement rules $din$ is built.

The algorithm for the proposed measurement is as follows. First, walks are performed on a
complex network with different memories $\mu$ and movement rules $din$. Thus, a joint
distribution $S_{\mu,d}^{(N)}(t,p)$ is obtained for each combination. For each joint
distribution, a histogram is calculated and a feature vector $\psi_{\mu,din}$ is built.
Finally, the vectors for each value of memory and movement rule are concatenated, yielding a
final feature vector with information from various sources and scales.
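
The final concatenation step can be sketched as below, assuming the histograms have already been computed for every combination of memory and movement rule. The function name `concatenate`, the dictionary layout and the ordering (all memories for the first rule, then all memories for the next rule) are illustrative assumptions; the text does not fix an ordering.

```python
def concatenate(histograms, memories, rules, n):
    """Concatenate the n first descriptors of each (mu, rule) histogram.

    histograms maps (mu, rule) to a dict {walk_length: frequency}.
    """
    psi = []
    for rule in rules:
        for mu in memories:
            h = histograms[(mu, rule)]
            psi.extend(h.get(mu + k, 0.0) for k in range(1, n + 1))
    return psi


# Hypothetical histograms for memories 0 and 1 and both movement rules.
histograms = {
    (0, "min"): {1: 1.0},
    (1, "min"): {2: 0.5, 3: 0.5},
    (0, "max"): {1: 1.0},
    (1, "max"): {2: 0.25, 4: 0.75},
}
psi = concatenate(histograms, memories=[0, 1], rules=["min", "max"], n=2)
```

The resulting vector has one block of $n$ descriptors per combination, so its length is the number of memories times the number of rules times $n$.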



IV. ANALYSIS OF THE PROPOSED METHOD 

In this section, we present an analysis of the proposed method with regard to complex
network features, such as average degree and number of vertices. In Fig. 3, histograms of
walk length for complex network models built using $N = 50000$ and mean vertex degree
$\langle k \rangle = 5$ are presented. For the purpose of comparison, each column represents
an iteration of the same model and each row represents a complex network model. The
histograms present distinct patterns for each model, which indicates that the walks can be
used to characterize each complex network model. Table I shows four statistics calculated
from the histograms. The mean of the walk lengths is the longest on the scale-free model.
This occurs because most vertices have few connections, which makes it difficult to find
attractors. On the other hand, the mean of the walk lengths is the shortest on the
geographical networks. In geographical networks, the vertices are connected according to
spatial distance, which favors the formation of groups of connected vertices and thus the
identification of attractors.

FIG. 3. Histograms of walk length for different complex network models built using
$N = 50000$ and mean vertex degree $\langle k \rangle = 5$.

Statistics of Histograms

Model                  Mean     Standard Deviation   Entropy   Skewness
Small-world            27.83    24.12                -5.97     1.72
Erdős-Rényi            85.57    57.42                -7.66     0.64
Geographical Network   17.88    10.50                -5.19     1.08
Scale-free             106.02   40.59                -7.16     -0.18

TABLE I. Statistics of the histograms for different complex network models.
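
Statistics of this kind can be computed directly from a normalized histogram. The sketch below is only an illustration: the text does not give the exact definitions used, so the function name `histogram_stats`, the use of frequency-weighted moments of the walk length and the sign convention of the entropy (the sum of $f \log_2 f$, which is negative like the tabulated values) are assumptions.

```python
import math


def histogram_stats(hist):
    """Mean, standard deviation, entropy and skewness of {length: frequency}."""
    mean = sum(length * f for length, f in hist.items())
    var = sum(f * (length - mean) ** 2 for length, f in hist.items())
    std = math.sqrt(var)
    # Assumed convention: sum of f * log2(f), hence a non-positive number.
    entropy = sum(f * math.log2(f) for f in hist.values() if f > 0)
    third = sum(f * (length - mean) ** 3 for length, f in hist.items())
    skewness = third / std ** 3 if std > 0 else 0.0
    return mean, std, entropy, skewness


# Hypothetical histogram: half of the walks have length 2, half have length 4.
stats = histogram_stats({2: 0.5, 4: 0.5})
```

For this symmetric two-bin example the mean is 3, the standard deviation is 1, the entropy under the assumed convention is -1 and the skewness is 0.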



Figure 4 illustrates the effect on the histograms of $N$ ranging from 10000 to 50000 while
keeping $\langle k \rangle = 50$ on the small-world model. The histograms remain similar as
the number of vertices increases. For the small-world network, the histograms show a peak at
walks of small length and decay as the walk length increases. As an example, the mean,
standard deviation, entropy and skewness of the histograms are given in Table II.





Statistics of Histograms

N        Mean    Standard Deviation   Entropy   Skewness
10000    27.89   23.76                -5.97     1.59
20000    25.05   19.02                -5.80     1.39
30000    26.98   22.09                -5.93     1.66
50000    26.52   21.96                -5.89     1.71
100000   26.28   20.95                -5.88     1.64

TABLE II. Statistics of the histograms for different values of $N$ on the small-world model.

The histograms of walk lengths for different values of $\langle k \rangle$ are shown in
Figure 5. As we increase $\langle k \rangle$, the peak of walks with short length is
smoothed. For high values of $\langle k \rangle$, the vertices become more connected, which
hinders the walker in its search for an attractor. As a result, walks with longer lengths
are generated more frequently. Table III presents the histogram statistics for different
values of $\langle k \rangle$.

FIG. 4. Histograms of walk length for the number of vertices varying from $N = 10000$ to
$50000$ on the small-world model. Panels: (a) $N = 10000$, (b) $N = 20000$, (c) $N = 30000$,
(d) $N = 50000$; the horizontal axis of each panel is the walk length.



V. EXPERIMENTS AND RESULTS 



To evaluate partially self-avoiding deterministic walks as a complex network measurement,
experiments were performed on a data set composed of 40,000 artificial networks.
The complex network models considered in this data set are Erdős-Rényi, small-world,
geographical and scale-free. The data set consists of artificial complex networks
built with random edge weights, number of vertices ranging from $N = 100$ to $1000$ in steps
of 100, and average degree ranging from $\langle k \rangle = 2$ to $20$ in steps of 2. For
the geographical network, the vertices were randomly and uniformly distributed inside a
square box.

FIG. 5. Histograms of walk length for mean degree varying from $\langle k \rangle = 10$ to
$50$ on the small-world model.

Statistics of Histograms

$\langle k \rangle$   Mean    Standard Deviation   Entropy   Skewness
10                    30.86   23.11                -6.14     1.35
20                    41.53   31.96                -6.61     1.83
30                    48.61   33.84                -5.83     0.91
50                    58.01   39.68                -7.07     0.76

TABLE III. Statistics of the histograms for different values of $\langle k \rangle$ on the
small-world model.

As in previous work, $n = 4$ descriptors from the histogram have been used to compose the
signature vector. The statistical analysis reveals that relevant information is concentrated
only in the first few elements. The signature vectors of the complex networks were
classified using a KNN classifier ($k = 1$) in a 10-fold cross-validation strategy. Since
our focus is on modeling, we use a simple classifier rather than a more sophisticated one,
such as support vector machines, which have been shown to produce superior results but
require more parameter tuning.

A. Parameters Evaluation 

An evaluation of the parameters of the partially self-avoiding deterministic walks is
presented in the following. Classification results for different values of $\mu$ and
movement rule $din$ are presented in Table IV. In most cases, movement rule min achieved
better classification results than rule max. The former finds attractors in homogeneous and
local regions, i.e. regions where the edge weights are low. Figure 6 shows the Principal
Component Analysis (PCA) projection onto two dimensions for both movement rules. The
potential of the movement rule min to obtain separated clusters is evident from this
example. Another important result is that the concatenation of both rules ([min max])
increased the correct classification rate. This is because the strategy keeps local and
global information from the complex network, providing a powerful framework for complex
network characterization.





Memory ($\mu$)   0       1       2       3       4       5
min              74.31   87.27   73.86   64.45   61.59   58.45
max              76.2    81.26   70.98   63.21   58.94   55.76
[min max]        84.07   93.02   84.03   74.84   68.37   61.09

TABLE IV. Correct classification rate for $\psi_{\mu,din}$ with different values of $\mu$
and movement rule $din$.

(a) max. (b) min.

FIG. 6. PCA projection for 4,000 networks obtained by using Erdős-Rényi, geographical
network, small-world and scale-free. Networks were built with $N = 1000$ and
$\langle k \rangle = 20$ and the walks were performed with $\mu = 5$. In (a), the walker
chooses to go to the closest site, while in (b), the walker goes to the furthest one.

Results from Table IV also show that, for both movement rules, the correct classification
rate decreases as the memory increases, except for $\mu = 0$, which is the trivial case of
the deterministic walk. These results stem from the fact that, as the memory increases, a
walk has more difficulty in finding an attractor. On the other hand, small values of memory
$\mu$ perform a better local analysis of the network structure, resulting in a higher
correct classification rate. As an illustration, Figure 7 presents the first two PCA
components for different values of memory.

Results using signature vectors composed of the concatenation of multiple values of
memory are presented in Table V. This strategy diminishes the importance of individual
values of $\mu$ and allows walks with a wider range of lengths, thus providing a more robust
characterization. In Figure 8, the PCA projection of signature vectors composed of multiple
values of $\mu$ (0, 1, 2, 3, 4, 5) is shown.

B. Multivariate Analysis of Variance 

Here, we determine whether the means of the feature vectors differ significantly among
the four classes of complex networks. To answer this question, we apply the multivariate
analysis of variance (MANOVA). The MANOVA takes a set of grouped data composed of the
features extracted using the partially self-avoiding walks. First, we generate 1000 complex
networks for each class (random network, scale-free, geographical and small-world). For each
complex network, we compute the histogram of the partially self-avoiding walks and extract
the feature vector as in the previous experiments. Thus, one has a data matrix of 4000 rows
(1000 samples for each class) and 60 columns (features). Figure 9 depicts the scatter plot
matrix for the first four features. On the bottom panel, the figure shows the box plots for
the different classes.

FIG. 7. PCA projection for complex network models built with $N = 1000$ and
$\langle k \rangle = 20$ using deterministic walks with different values of memory and
$din$ = [min, max].

FIG. 8. PCA projection of signature vectors composed of the concatenation of memories
0, 1, 2, 3, 4 and 5. The signature vectors were extracted from complex networks built with
$N = 1000$ and $\langle k \rangle = 20$.

Denoting the means of the four classes by $\mu_1, \mu_2, \mu_3, \mu_4$, we perform the
MANOVA to test the hypothesis that $\mu_1 = \mu_2 = \mu_3 = \mu_4$. First, the test of
homogeneity (equality) of variance is performed. Since the variances are found to be equal,
the means are then tested, and the hypothesis of equal means is rejected even at the 1%
significance level.

Memories ($\mu$)   min     max     [min max]
{0,1}              95.49   94.23   98.11
{0,1,2}            95.65   94.47   98.16
{0,1,2,3}          95.73   94.68   98.19
{0,1,2,3,4}        95.66   94.82   98.21
{0,1,2,3,4,5}      95.65   94.83   98.22

TABLE V. Correct classification rate for $\psi$ composed of the concatenation of values of
$\mu$.

FIG. 9. Scatter plot matrix for the first four features obtained from 4000 complex networks.
Legend: random, scale-free, geographical and small-world; the bottom panels show box plots
per class.

In Figure 10, we show a dendrogram of the class means after the MANOVA. To obtain the
dendrogram, the single linkage method is applied to the matrix of Mahalanobis distances
between class means. The ordinate represents the distance at which two classes are
connected. In this way, one observes that the random network and the geographical network
produce features with the most similar characteristics. Also, the scale-free network is the
one that produces the most different features from the other three networks. These results
are corroborated by the plots of Figure 9.
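
The single linkage step that produces such a dendrogram can be sketched as follows. The sketch assumes that the matrix of pairwise distances between class means is already available; the function name `single_linkage`, the generic class names and the distance values are made up for illustration and are not results from the paper.

```python
def single_linkage(dist, names):
    """Agglomerate clusters by minimum pairwise distance.

    Returns the merges as (cluster_a, cluster_b, height), in merge order.
    """
    members = {i: [i] for i in range(len(names))}
    labels = {i: (names[i],) for i in range(len(names))}
    merges = []
    next_id = len(names)
    while len(members) > 1:
        best = None
        ids = sorted(members)
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                a, b = ids[x], ids[y]
                # Single linkage: distance between the closest pair of members.
                d = min(dist[i][j] for i in members[a] for j in members[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((labels[a], labels[b], d))
        members[next_id] = members.pop(a) + members.pop(b)
        labels[next_id] = labels.pop(a) + labels.pop(b)
        next_id += 1
    return merges


# Made-up symmetric distance matrix between four hypothetical classes.
names = ["A", "B", "C", "D"]
dist = [
    [0, 1, 4, 9],
    [1, 0, 3, 8],
    [4, 3, 0, 7],
    [9, 8, 7, 0],
]
merges = single_linkage(dist, names)
```

The heights of the successive merges are the values read on the ordinate of the dendrogram: the pair merged first is the most similar one, and the class merged last is the most different from the rest.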



FIG. 10. Dendrogram of the class means obtained after the multivariate analysis of variance
(MANOVA), which takes a set of grouped data composed of the features extracted using the
partially self-avoiding walks of 1000 complex networks for each class (random network,
scale-free, geographical and small-world). The leaves appear in the order random network,
geographical, small-world, scale-free; the vertical axis has ticks from 100 to 700.

C. Correlation with Traditional Measures 

The network classification obtained with the partially self-avoiding deterministic walks
is compared to that obtained with traditional measurements. The results of these comparisons
are shown in Table VI. For each traditional measurement, computed for each vertex, the mean
value and its standard deviation are estimated. The results in Table VI indicate that the
correct classification rate improves from 78.32%, obtained with the clustering coefficient,
to 98.24% with our proposed method. The next highest rate among the traditional
measurements, 71.87%, is obtained for the Pearson correlation. These rates are obtained with
the Decision Tree classifier.

Measurement               KNN     Naive Bayes   Decision Tree
Degree                    66.72   25.64         67.77
Hier. Degree 2            54.67   28.45         64.47
Hier. Degree 3            54.14   32.17         64.33
Weighted Degree           43.49   25.63         54.72
Weighted Hier. Degree 2   51.98   28.50         62.61
Weighted Hier. Degree 3   53.27   32.11         63.51
Clustering Coefficient    68.13   69.19         78.32
Pearson Correlation       67.02   58.96         71.87
Proposed Measurement      98.22   80.12         98.24

TABLE VI. Comparison between measurements extracted from the complex networks.

An interesting strategy for classifying complex networks involves the concatenation of
signature vectors extracted by different measurements. In this work, we considered two
concatenations: a) concatenation of all traditional measurements and b) concatenation of
the proposed measurement and all traditional ones. Table VII presents the experimental
results obtained by these combinations on the complex network model classification. The
result obtained by the proposed measurement alone is comparable to that obtained when all
traditional measurements are concatenated (98.22% against 99.18% using the KNN classifier
and 98.24% against 99.10% using the Decision Tree classifier). The combination of the
traditional measurements and the proposed one achieved the highest correct classification
rate, 99.96%.

In principle, a signature vector composed of more measurements can provide a better
complex network classification. This suggests that each network model has specific
topological features modeled by different traditional measurements. The experimental results
obtained by the proposed measurement suggest that this new measurement can model most
of the topological features of the complex network models. These results are even more
important since the concatenation of an excessive number of measurements may affect the
quality of complex network classification.

Measurement                            KNN     Naive Bayes   Decision Tree
Traditional Measurements               99.18   80.96         99.10
Proposed Measurement                   98.22   80.12         98.24
Traditional and Proposed Measurement   99.96   81.54         99.92

TABLE VII. Comparison between combinations of measurements.

VI. CONCLUSION

Here, we have presented a new method to classify complex networks based on self-avoiding
deterministic trajectories generated by agents leaving from each site of a given weighted
network. The agent movements are given by a rule whose main characteristic is to forbid
visits to recently visited vertices, within a memory $\mu$. The trajectories formed by these
agents have a transient time and finish in a cycle, an attractor with a given period.
From the transient time and attractor period joint distribution, we build a signature
vector. It is with this vector that one classifies the network types. In this way, the
agents explore and characterize the complex network topology. By combining walks performed
with different values of $\mu$ and $din$, it is possible to include information from
different sources and scales and thus improve the complex network characterization.

Promising results have been obtained on a data set composed of 40,000 networks of
four well-known models. Experimental results indicate that the proposed method improves
the recognition of models compared to the traditional measurements. In addition, our method
makes the modeling of complex networks feasible and simple, since this new measurement alone
is enough to obtain satisfactory classification results.